I. SUMMARY

Work during the present quarter (January-March 1952) has
been marked by a discovery that radically alters the basic
computation techniques for the above project. The new method
involves a simple means of expressing the covariances of the
order statistics in a random sample of n observations from the
extreme-value distribution in terms of tabulated functions.

As a result, the scope of the results to be expected can be
greatly extended without much, if any, increase in cost over
what had been planned.

The present report gives a brief summary of what has been
accomplished and what can be expected under the new techniques,
and includes tentative plans for preparing the results for
publication. This report presents only the main lines of
development and avoids most of the mathematical details, since
it is intended to serve as a basis for discussion of further
work by representatives of the National Bureau of Standards and
the National Advisory Committee for Aeronautics at a meeting to
be arranged in the near future. The balance of work under this
project, as outlined herein, is to be planned with the goal of
completion by the end of FY 1952.

II. DESCRIPTION OF RECENT WORK

A. Background.

The specific objective is to obtain a statistical
function which will provide improved methods of estimating
maximum values of acceleration increments and gust
velocities which may be expected by an airplane in flight.

By maximum acceleration increment is meant the
largest value occurring during a single flight of a given
airplane. If a series of n flights of the same plane is
considered, then there will be a maximum value X for each,
and the set of the n maxima, $X_1, X_2, \ldots, X_n$, constitutes
a sample of n observations to be analyzed. One objective
of the analysis is to predict a value such that one may,
in a long series of flights, expect that the proportion P
of the flight maxima will not exceed this value, while
the remaining proportion, 1 - P, will exceed (or equal)
it. This upper limit is designated as $X_P$ and naturally
depends upon P. For example, if we want to estimate a
limit $X_P$ such that in only a very small proportion (1 - P)
of the flights will a larger value occur, then $X_P$ must be
expected to be quite large. The limit $X_P$ is not known
but must be estimated from sample data such as the set of
n maxima $X_1, X_2, \ldots, X_n$ mentioned above.

The method of estimation is to find a function
$f = f(X_1, X_2, \ldots, X_n)$ of these n variables, called
an estimator of $X_P$, which conforms to the following two
desirable characteristics as closely as possible:

(1) The estimator is unbiased: $E[f(X_1, X_2, \ldots, X_n)] = X_P$,
where E denotes mathematical expectation. This
means that the estimator f fluctuates about a long-run
average which is the correct value, $X_P$.

(2) The estimator is most efficient, that is, has
minimum variance: $\sigma^2(f) = E(f - X_P)^2$, where
$\sigma^2$ denotes variance. This means that the values
taken by the estimator f are concentrated so closely about
the desired true value $X_P$ that of all unbiased estimators
of $X_P$ it has minimum mean squared error.

It is clear that the values of a function f possessing
the above two properties may be expected to give very
satisfactory estimates of the unknown value $X_P$. However,
in order to apply these two properties, some assumption
must be made about the form of the statistical population
from which the observations are assumed to come. The fact
that each observation is itself a maximum of many individual
values encountered in an individual flight, together with
other supporting data discussed by Press (reference 1),
gives theoretical ground for assuming the underlying
population to be of the extreme-value type studied by Dr. E. J.
Gumbel, namely $F(x) = \exp(-e^{-y})$, $y = \alpha(x - u)$, where F(x)
denotes the cumulative distribution function.
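
The distribution function above can be inverted in closed form, so that the
limit $X_P$ follows directly once u and $\alpha$ are known. A minimal sketch
in Python is given here for illustration; the function names and the parameter
values are assumptions made for the example and are not taken from the flight
data.

```python
import math


def gumbel_cdf(x, u, alpha):
    """F(x) = exp(-exp(-y)) with reduced variate y = alpha * (x - u)."""
    y = alpha * (x - u)
    return math.exp(-math.exp(-y))


def gumbel_quantile(p, u, alpha):
    """Limit X_P with F(X_P) = P: X_P = u + y_P / alpha, y_P = -ln(-ln P)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    y_p = -math.log(-math.log(p))
    return u + y_p / alpha


# Hypothetical parameter values, chosen only to exercise the functions.
u, alpha = 1.0, 2.0
x_95 = gumbel_quantile(0.95, u, alpha)
print(x_95, gumbel_cdf(x_95, u, alpha))
```

In practice u and $\alpha$ are unknown, which is why an estimator of $X_P$
built from the sample is required.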

Methods of estimation in present use by the NACA
involve the mean and standard deviation of a sample of
given data. These methods have been developed principally
by Dr. Gumbel. They are not very complicated, but have
the disadvantage that their bias and efficiency have not
been evaluated because of the great amount of calculation
that would be necessary in order to determine the expected
values and variances of the functions involved in Dr.
Gumbel's estimators.


B. Current.

Research to date under the present project has
concentrated on building up a simple type of estimator f
which is a linear function of the order statistics of the
sample. That is, if we arrange the n observations in
increasing order of size and let $x_1$ denote the smallest and
$x_n$ the largest, then the sample may be represented by

$(x_1, x_2, \ldots, x_n)$, $x_1 \le x_2 \le \cdots \le x_n$,

where the $x_i$ are called order statistics of the sample,
and the linear estimator sought is of the form

$T_n = w_1 x_1 + w_2 x_2 + \cdots + w_n x_n$.


The weights $w_i$ are to be determined so as to satisfy
conditions (1) and (2) above, namely:

(1) $T_n$ is unbiased: $\sum_{i=1}^{n} w_i E x_i = X_P$;

(2) $T_n$ has minimum variance:
$\sigma^2(T_n) = \sum_{j=1}^{n} \sum_{i=1}^{n} \sigma_{ij} w_i w_j$
is to be a minimum, where the coefficients $\sigma_{ij}$ denote
the variances and covariances of the order statistics
of the sample of n.
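
Conditions (1) and (2) together form a constrained minimization that can be
solved with Lagrange multipliers. The sketch below is one way to do so and is
not necessarily the procedure used in the project. It assumes that
unbiasedness is required for every u and $\alpha$, which (writing
$E x_i = u + m_i/\alpha$ and $X_P = u + y_P/\alpha$) gives the two constraints
$\sum w_i = 1$ and $\sum w_i m_i = y_P$. The function names are hypothetical,
and the n = 2 moments in the example are derived from elementary properties
of two independent reduced extremes rather than copied from reference 2.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, copied so the inputs are left unchanged.
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / m[r][r]
    return x


def linear_estimator_weights(means, cov, y_p):
    """Weights minimizing sum_ij cov[i][j] w_i w_j subject to
    sum_i w_i = 1 and sum_i w_i * means[i] = y_p (reduced variate)."""
    n = len(means)
    size = n + 2  # n weights plus two Lagrange multipliers
    kkt = [[0.0] * size for _ in range(size)]
    rhs = [0.0] * size
    for i in range(n):
        for j in range(n):
            kkt[i][j] = 2.0 * cov[i][j]
        kkt[i][n] = kkt[n][i] = 1.0
        kkt[i][n + 1] = kkt[n + 1][i] = means[i]
    rhs[n] = 1.0
    rhs[n + 1] = y_p
    return solve_linear(kkt, rhs)[:n]


# Example for n = 2 in reduced units (u = 0, alpha = 1).
gamma = 0.5772156649015329  # Euler's constant
ln2 = math.log(2.0)
means = [gamma - ln2, gamma + ln2]
cov = [[math.pi ** 2 / 6 - 2 * ln2 ** 2, ln2 ** 2],
       [ln2 ** 2, math.pi ** 2 / 6]]
y_95 = -math.log(-math.log(0.95))
w = linear_estimator_weights(means, cov, y_95)
print(w)
```

For n = 2 the two constraints already determine both weights, so the
covariances play no part; they begin to matter only for n = 3 and above.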


The means, $E x_i$, in the foregoing are the first moments
of ranked extremes and have already been tabulated in
reference 2. It had been planned to compute the variances
and covariances by numerical integration for sample sizes
up to n = 10, above which point the computation would have
become too costly.
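
The kind of numerical integration originally planned can be illustrated for
the single integrals, that is, the means and variances. The following is a
minimal sketch only: it assumes the reduced variate $y = \alpha(x - u)$, the
standard density of the i-th order statistic, and composite Simpson's rule
with integration limits and step count chosen arbitrarily for the example.
The covariances would require the corresponding double integrals, which are
not shown.

```python
import math


def pdf(y):
    """Density of the reduced extreme-value variate."""
    return math.exp(-y - math.exp(-y))


def cdf(y):
    """F(y) = exp(-exp(-y))."""
    return math.exp(-math.exp(-y))


def sf(y):
    """1 - F(y), computed with expm1 to keep precision in the upper tail."""
    return -math.expm1(-math.exp(-y))


def order_stat_moment(i, n, power, lo=-10.0, hi=40.0, steps=4000):
    """E[y_(i)^power] for the i-th smallest of n reduced extremes,
    by composite Simpson's rule on [lo, hi] (steps must be even)."""
    const = math.factorial(n) / (math.factorial(i - 1) * math.factorial(n - i))

    def g(y):
        return const * y ** power * cdf(y) ** (i - 1) * sf(y) ** (n - i) * pdf(y)

    h = (hi - lo) / steps
    total = g(lo) + g(hi)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * g(lo + k * h)
    return total * h / 3.0


# Mean and variance of the largest of three reduced extremes.
mean_3 = order_stat_moment(3, 3, 1)
var_3 = order_stat_moment(3, 3, 2) - mean_3 ** 2
print(mean_3, var_3)
```

The largest of n reduced extremes is again an extreme-value variate shifted by
ln n, so its mean should agree with Euler's constant plus ln n and its
variance with $\pi^2/6$; this gives a convenient check on the integration.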


The discovery during the present quarter of a method
for expressing the $\sigma_{ij}$ explicitly in terms of tabulated
functions considerably reduces the amount of computation
necessary for small samples.

Larger samples can be handled by breaking them into
smaller samples, obtaining an estimate of the desired
quantity from each one of these samples, and pooling the
results. In this manner samples of any size can be handled
by the techniques developed for very small samples. This
method of subgroups also has the advantage of making
possible a control chart procedure whereby internal
consistency of the data and stability of operating conditions
could be checked. The method is also especially well
adapted to the form of estimator being investigated, and
would not be applicable to more complicated functions,
such as those of Gumbel.
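
The method of subgroups can be sketched as follows. This is an illustration
under stated assumptions: the subgroups are formed in the order in which the
observations were taken, the subgroup estimates are pooled by a simple
average, leftover observations that do not fill a subgroup are dropped, and
the weights and data shown are hypothetical placeholders rather than computed
values.

```python
def linear_estimate(sample, weights):
    """Apply weights w_1..w_n to the order statistics of one subgroup."""
    ordered = sorted(sample)
    return sum(w * x for w, x in zip(weights, ordered))


def subgroup_estimate(data, weights):
    """Split data into consecutive subgroups of size n = len(weights),
    estimate from each subgroup, and pool by averaging.

    Returns the pooled estimate and the list of subgroup estimates;
    the latter is what a control chart would plot."""
    n = len(weights)
    groups = [data[k:k + n] for k in range(0, len(data) - n + 1, n)]
    if not groups:
        raise ValueError("fewer observations than one subgroup")
    estimates = [linear_estimate(g, weights) for g in groups]
    return sum(estimates) / len(estimates), estimates


# Hypothetical flight maxima and hypothetical weights for n = 2.
maxima = [3.0, 1.0, 2.0, 5.0, 4.0, 9.0, 7.0]
pooled, per_group = subgroup_estimate(maxima, [-1.0, 2.0])
print(pooled, per_group)
```

Because each subgroup estimate is unbiased, their average is unbiased as well,
and the individual subgroup values remain available for checking consistency.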

The new method has been rendered still more powerful
by a refinement devised by Mr. I. Richard Savage of the
Statistical Engineering Laboratory. This refinement not
only makes it possible to determine the unbiased estimators
for all probability levels P simultaneously, rather than
for just a few selected levels such as P = .95, .99, but
with very little additional effort yields also the
estimators of the two parameters which make it possible to fit
an extreme-value distribution to a given set of extreme
data. The method also automatically furnishes the
efficiency associated with the estimate obtained.
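
One way to see how estimators for all levels P can be obtained at once is to
estimate the two parameters linearly and then substitute them in
$X_P = u + y_P/\alpha$ with $y_P = -\ln(-\ln P)$. The sketch below does this
for subgroups of n = 2 only. It is an assumed reading of the refinement, not
a transcription of it: the parameters are taken to be u and $1/\alpha$, the
function names are hypothetical, and the coefficients follow from the means
of the two reduced order statistics, Euler's constant minus and plus ln 2.

```python
import math

EULER_GAMMA = 0.5772156649015329
LN2 = math.log(2.0)


def estimate_parameters_n2(x_small, x_large):
    """Unbiased linear estimates of u and 1/alpha from a subgroup of two."""
    beta_hat = (x_large - x_small) / (2.0 * LN2)  # estimates 1/alpha
    u_hat = 0.5 * (x_small + x_large) - EULER_GAMMA * beta_hat
    return u_hat, beta_hat


def estimate_quantile_n2(x_small, x_large, p):
    """Estimate of X_P for any level P from the same two statistics."""
    u_hat, beta_hat = estimate_parameters_n2(x_small, x_large)
    y_p = -math.log(-math.log(p))
    return u_hat + y_p * beta_hat


# Hypothetical subgroup of two flight maxima.
for level in (0.90, 0.95, 0.99):
    print(level, estimate_quantile_n2(1.2, 2.0, level))
```

The same pair of parameter estimates serves every level P, so no separate set
of weights has to be computed for each level.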

The method has been tried out on a preliminary basis
with encouraging results. Unbiased estimators have been
found for samples of any size based on splitting the
sample into subgroups of n = 2 and 3. If subgroups of 3
are used instead of subgroups of 2, the improvement in
efficiency is represented by an increase of 1$ percentage
points, on the basis of 100 percentage efficiency of the
theoretically most efficient estimator. If we proceed from
n = 3 to n = 4, etc., further jumps in efficiency are
expected to take place, with the result that by the time
n = 6 or 7 we should reach 80 or 90 percent efficiency,
which is probably adequate for the purpose in view.

C. Proposed further work.

Work thus far points to immediate continuation along
the following lines:

(1) Computation of "subgroup" estimators described
above for n = 5, 6, 7, say, in order to reach a
practical level of efficiency.

(2) Evaluation of the bias and variance of the present
Gumbel-type estimators for comparison with the proposed
estimators at least for samples of moderate size. It
is proposed that this be accomplished by methods of
empirical sampling for n = 10, 20, 30.

(3) Comparison of the results in (1) and (2) with
asymptotic theory.

Asymptotic theory for large samples has been
discussed in NBS Report 1129, prepared under the
NACA project (reference 3). It was there shown
how to select as few as three out of a large
number n of sample values, which yield unbiased
estimators of surprisingly good efficiency. If it
turns out that asymptotic theory compares favorably
with exact methods even for n as low as 10, then
an improvement in procedures might become possible
for larger values of n.
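
The empirical sampling proposed in item (2) could be organized along the
following lines. This is a sketch under explicit assumptions: the estimator
shown is a simplified moment estimator built from the sample mean and
standard deviation, standing in for the Gumbel-type estimators, whose exact
formulas are not given here; sampling is from the reduced distribution
(u = 0, $\alpha$ = 1); and the number of repetitions and the seed are
arbitrary.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329


def sample_gumbel(rng, n, u=0.0, alpha=1.0):
    """Draw n extreme-value observations by inverting F(x)."""
    out = []
    for _ in range(n):
        v = rng.random()
        while v == 0.0:  # keep v strictly inside (0, 1)
            v = rng.random()
        out.append(u - math.log(-math.log(v)) / alpha)
    return out


def moment_quantile_estimate(sample, p):
    """Simplified moment estimator of X_P (an assumed stand-in)."""
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)
    beta_hat = sd * math.sqrt(6.0) / math.pi  # estimates 1/alpha
    u_hat = mean - EULER_GAMMA * beta_hat
    return u_hat + (-math.log(-math.log(p))) * beta_hat


def empirical_bias_variance(n, p, reps=2000, seed=1952):
    """Estimate bias and variance of the estimator by repeated sampling."""
    rng = random.Random(seed)
    true_value = -math.log(-math.log(p))  # X_P for u = 0, alpha = 1
    estimates = [moment_quantile_estimate(sample_gumbel(rng, n), p)
                 for _ in range(reps)]
    bias = statistics.mean(estimates) - true_value
    return bias, statistics.variance(estimates)


results = {n: empirical_bias_variance(n, 0.95) for n in (10, 20, 30)}
for n, (bias, var) in results.items():
    print(n, bias, var)
```

The same loop could be run with any other estimator substituted for
`moment_quantile_estimate`, which would put the competing methods on a common
footing for comparison.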

III. PLANS FOR PUBLICATION

The following means of writing up the various aspects of
the work are envisioned, subject to the usual NACA clearance
procedures:

(1) A mathematical paper presenting the derivation of the
formula for the evaluation of the double integrals
mentioned above, together with related mathematical
results.

(2) A technical report to NACA, somewhat in the nature of
a manual, which would express the new method in
practical terms and would be written for the use of
engineering personnel and field workers. This report
might also give a general indication of the nature and
characteristics of the method.


Julius Lieblein
Statistical Engineering
Laboratory


National Bureau of Standards
Washington, D. C.
14 March 1952